The closure of Kasabi, the data publishing platform created by Talis, has been announced for July 30, 2012. Although the service has only been around for about two years, it offers a unique look at services for Linked Data and contains a variety of datasets. In a subsequent post, Kasabi announced a spreadsheet listing where each dataset can be downloaded from Amazon S3. That spreadsheet has been uploaded to the Internet Archive as datasets.csv, and each dataset it references has been uploaded as well.
The database snapshots are gzipped N-Triples and gzipped N-Quads files. The N-Quads snapshots differ in that they include a named graph URL for the dataset. For datasets that have a named graph URL, the graph was downloaded and saved under the dataset name with the ".nt" extension. For example, prelinger-archives.gz has its named graph stored as prelinger-archives.nt.
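As a rough illustration of the two snapshot formats and the naming convention described above, the following Python sketch maps a snapshot name to its named graph file name and samples a gzipped snapshot to see whether its statements have three terms (N-Triples) or four (N-Quads). The function names and the regular expression are illustrative assumptions and are not taken from the archiving code; the term matcher is a simplification, not a full N-Triples/N-Quads parser.

```python
import gzip
import re
from collections import Counter
from itertools import islice

# Simplified matcher for RDF terms (an assumption, not a complete grammar).
_TERM = re.compile(r'''
    <[^>]*>                                             # IRI
  | _:[^\s]+                                            # blank node
  | "(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?    # literal
''', re.VERBOSE)


def named_graph_filename(snapshot_name):
    """Map e.g. 'prelinger-archives.gz' to 'prelinger-archives.nt'."""
    if not snapshot_name.endswith(".gz"):
        raise ValueError("expected a .gz snapshot name")
    return snapshot_name[:-len(".gz")] + ".nt"


def statement_kind(line):
    """Return 'triple', 'quad', 'unknown', or None for blank/comment lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    terms = _TERM.findall(line)
    return {3: "triple", 4: "quad"}.get(len(terms), "unknown")


def sniff_snapshot(path, sample=100):
    """Count statement kinds in the first `sample` lines of a gzipped snapshot."""
    counts = Counter()
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        for line in islice(fh, sample):
            kind = statement_kind(line)
            if kind is not None:
                counts[kind] += 1
    return counts
```

For a locally downloaded snapshot, `sniff_snapshot("prelinger-archives.gz")` would return a `Counter` showing how many of the sampled lines look like triples and how many look like quads.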
The code used to download the datasets from S3 and upload them to the Internet Archive is available on GitHub.
This item is part of the collection: Archive Team
Identifier: kasabi
Publicdate: 2012-07-17 18:42:31
Mediatype: web
Addeddate: 2012-07-17 18:42:31
Rights: Each dataset contained in the Kasabi Archive has a separate license. The license information is present in the uploaded datasets.csv.
Keywords: data; rdf; linked data
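Because the license for each dataset is recorded in datasets.csv, a small script can help with looking it up. The Python sketch below loads the spreadsheet and returns its column names and rows. It assumes the first row of the file is a header; the column names are not documented on this page, so the code reads them from the file rather than hard-coding any. The function name and the local path are hypothetical.

```python
import csv


def load_dataset_rows(path):
    """Return (column_names, rows) from a CSV whose first row is a header.

    Assumption: datasets.csv starts with a header row. Each row is returned
    as a dict keyed by whatever column names the file actually uses.
    """
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        columns = reader.fieldnames or []
        rows = list(reader)
    return columns, rows
```

Calling `load_dataset_rows("datasets.csv")` on a local copy and inspecting the returned column names is a reasonable first step before filtering rows by license.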
| Information | Format | Size |
| kasabi_files.xml | Metadata | [file] |
| kasabi_meta.xml | Metadata | 2.1 KB |
| Other Files | Format | Size |
| adelaide-metro.gz | GZIP | 35.9 MB |
| adelaide-metro.nt | Unknown | 9.3 KB |
| adventureworks-2008r2lt.gz | GZIP | 182.0 KB |
| aim25omp.gz | GZIP | 1.6 KB |
| aim25omp.nt | Unknown | 5.6 KB |
| airlines.gz | GZIP | 638.9 KB |
| airlines.nt | Unknown | 4.2 KB |
| airports.gz | GZIP | 9.5 MB |
| airports.nt | Unknown | 12.5 KB |
| aligaballa.gz | GZIP | 20.0 B |
| archon.gz | GZIP | 535.1 KB |
| archon.nt | Unknown | 9.8 KB |
| bbc-music.gz | GZIP | 18.3 MB |
| bbc-music.nt | Unknown | 7.0 KB |
| bbc-programmes.gz | GZIP | 520.4 MB |
| bbc-programmes.nt | Unknown | 16.9 KB |
| bbc-wildlife.gz | GZIP | 1.1 MB |
| bbc-wildlife.nt | Unknown | 19.0 KB |
| bbc.gz | GZIP | 536.4 MB |
| bbc.nt | Unknown | 36.6 KB |
| bergen-open-research-archive.gz | GZIP | 1.5 MB |
| bergen-open-research-archive.nt | Unknown | 4.5 KB |
| beyond-book.gz | GZIP | 132.4 KB |
| blackberry-bold-9900-batteries.gz | GZIP | 20.0 B |
| brandweer-amsterdam-amstelland-dispatch-messages.gz | GZIP | 2.2 MB |
| brandweer-amsterdam-amstelland-dispatch-messages.nt | Unknown | 6.7 KB |
| bricklink-similar-sets.gz | GZIP | 1.0 MB |
| bricklink-similar-sets.nt | Unknown | 3.3 KB |
| bricklink.gz | GZIP | 20.0 MB |
| bricklink.nt | Unknown | 9.1 KB |
| brighton-footprints.gz | GZIP | 2.2 KB |
| bruce-whealton-foaf-profile.gz | GZIP | 11.4 KB |
| bruce-whealton-foaf-profile.nt | Unknown | 6.4 KB |
| bryans-training-data.gz | GZIP | 1.7 KB |
| bryans-training-data.nt | Unknown | 7.8 KB |
| calil-library-data.gz | GZIP | 286.3 KB |
| calil-library-data.nt | Unknown | 3.0 KB |
| calling.gz | GZIP | 2.0 MB |
| calling.nt | Unknown | 4.5 KB |
| cars.gz | GZIP | 577.5 KB |
| cars.nt | Unknown | 4.8 KB |
| cdk-cito.gz | GZIP | 3.8 KB |
| cdk-cito.nt | Unknown | 2.0 KB |
| chembl-rdf.gz | GZIP | 256.7 MB |
| chemicalcontent.gz | GZIP | 13.5 MB |
| chempedia-rdf.gz | GZIP | 20.0 B |
| cia-world-factbook-ng.gz | GZIP | 835.9 KB |
| cia-world-factbook-ng.nt | Unknown | 13.3 KB |
| climb-dataincubator.gz | GZIP | 2.2 MB |
| climb-similar-routes.gz | GZIP | 2.0 MB |
| climb-similar-routes.nt | Unknown | 3.2 KB |
| colegios-de-chile.gz | GZIP | 697.3 KB |
| contrib-brighton.gz | GZIP | 370.0 B |
| copac-locah-project.gz | GZIP | 30.0 B |
| countries.gz | GZIP | 3.8 KB |
| countries.nt | Unknown | 2.8 KB |
| datasets.csv | Comma-Separated Values | 31.5 KB |
| dbpedia-links.gz | GZIP | 328.0 MB |
| dbpedia-links.nt | Unknown | 47.1 KB |
| demostracion.gz | GZIP | 876.0 B |
| dendritic-cell-research-dc-research-eu.gz | GZIP | 315.9 KB |
| dev8d-2012.gz | GZIP | 26.2 KB |
| dev8d-2012.nt | Unknown | 8.6 KB |
| digital-city-brighton.gz | GZIP | 1.6 MB |
| digital-city-brighton.nt | Unknown | 20.5 KB |
| discogs.gz | GZIP | 1.4 GB |
| discogs.nt | Unknown | 8.0 KB |
| e1.gz | GZIP | 870.4 KB |
| e2.gz | GZIP | 1.5 KB |
| ecco-tcp-eighteenth-century-collections-online-texts.gz | GZIP | 489.2 KB |
| ecco-tcp-eighteenth-century-collections-online-texts.nt | Unknown | 9.7 KB |
| echo-european-commission-humanitarian-aid-and-civil-protection.gz | GZIP | 21.4 KB |
| echo-european-commission-humanitarian-aid-and-civil-protection.nt | Unknown | 26.5 KB |
| educacion.gz | GZIP | 1.8 MB |
| education-ordnance-survey-postcode-linkset.gz | GZIP | 3.1 MB |
| education-ordnance-survey-postcode-linkset.nt | Unknown | 3.5 KB |
| ejemplo.gz | GZIP | 827.0 B |
| ejemplo2.gz | GZIP | 588.0 B |
| ejemplo3.gz | GZIP | 585.0 B |
| ejemplo5.gz | GZIP | 573.0 B |
| elsevier-sponsored-documents.gz | GZIP | 11.1 MB |
| elsevier-sponsored-documents.nt | Unknown | 6.5 KB |
| english-heritage.gz | GZIP | 84.9 MB |
| english-heritage.nt | Unknown | 8.3 KB |
| english-language-books-listed-printed-book-auction-catalogues-17th-century-holland.gz | GZIP | 755.5 KB |
| eumida.gz | GZIP | 808.8 KB |
| eumida.nt | Unknown | 50.8 KB |
| eurobarometer-standard.gz | GZIP | 9.0 MB |
| european-election-results.gz | GZIP | 20.6 KB |
| european-election-results.nt | Unknown | 8.5 KB |
| eventseer.gz | GZIP | 20.0 B |
| exquisitely-refined-ipad-3-cases.gz | GZIP | 20.0 B |
| farmers-market.gz | GZIP | 662.5 KB |
| farmers-market.nt | Unknown | 17.1 KB |
| federal-reserve-economic-data.gz | GZIP | 99.2 MB |
| federal-reserve-economic-data.nt | Unknown | 6.2 KB |
| fixmystreet.gz | GZIP | 7.9 MB |
| food.gz | GZIP | 74.9 MB |
| food.nt | Unknown | 23.6 KB |
| foodista.gz | GZIP | 14.5 MB |
| foodista.nt | Unknown | 5.6 KB |
| gac-similar-works.gz | GZIP | 1.1 MB |
| gac-similar-works.nt | Unknown | 3.1 KB |
| gairola-dev-1.gz | GZIP | 418.0 B |
| gairola-dev-1.nt | Unknown | 3.3 KB |
| games.gz | GZIP | 110.0 B |
| games.nt | Unknown | 2.0 KB |
| genealogy-information.gz | GZIP | 180.8 KB |
| genealogy-information.nt | Unknown | 9.7 KB |
| geonames.gz | GZIP | 481.8 MB |
| geonames.nt | Unknown | 2.7 KB |
| geospecies.gz | GZIP | 15.0 MB |
| geospecies.nt | Unknown | 22.5 KB |
| global-hunger-index.gz | GZIP | 155.0 KB |
| global-hunger-index.nt | Unknown | 7.3 KB |
| government-art-collection.gz | GZIP | 4.1 MB |
| government-art-collection.nt | Unknown | 17.6 KB |
| hampshire-postcodes.gz | GZIP | 10.4 MB |
| hampshire-postcodes.nt | Unknown | 51.2 KB |
| hishamtestdataset.gz | GZIP | 730.0 B |
| iati.gz | GZIP | 33.2 MB |
| iati.nt | Unknown | 3.7 KB |
| icdb.gz | GZIP | 2.0 MB |
| icdb.nt | Unknown | 7.3 KB |
| italian-schools-kasabi.gz | GZIP | 22.9 KB |
| italian-schools.gz | GZIP | 20.0 B |
| italy.gz | GZIP | 27.5 KB |
| italy.nt | Unknown | 8.6 KB |
| jisc-cetis-project-directory.gz | GZIP | 943.1 KB |
| john-peel-archive.gz | GZIP | 874.9 KB |
| john-peel-archive.nt | Unknown | 4.9 KB |
| john-peel-sessions.gz | GZIP | 2.3 MB |
| john-peel-sessions.nt | Unknown | 6.6 KB |
| jpm-allanbeck-nasa-training.gz | GZIP | 1.4 KB |
| jpm-allanbeck-nasa-training.nt | Unknown | 6.2 KB |
| jpm-hs-nasa-training.gz | GZIP | 595.0 B |
| jpm-hs-nasa-training.nt | Unknown | 3.1 KB |
| kasabi-data-collector-test.gz | GZIP | 187.0 B |
| kasabi-directory.gz | GZIP | 96.4 KB |
| kasabi-space-training-database.gz | GZIP | 20.0 B |
| keith-test1.gz | GZIP | 20.0 B |
| keiths-space-training-dataset.gz | GZIP | 776.0 B |
| languages.gz | GZIP | 235.5 KB |
| languages.nt | Unknown | 3.0 KB |
| latc-eu-media.gz | GZIP | 11.1 MB |
| latc-eu-media.nt | Unknown | 6.6 KB |
| latc-linksets.gz | GZIP | 11.2 MB |
| latc-linksets.nt | Unknown | 2.5 KB |
| latc-metadata.gz | GZIP | 20.0 B |
| lichfield-spending-data.gz | GZIP | 677.7 KB |
| linked-movie-database.gz | GZIP | 42.2 MB |
| locah.gz | GZIP | 3.6 MB |
| lotico.gz | GZIP | 28.9 KB |
| lotico.nt | Unknown | 4.7 KB |
| maomava.gz | GZIP | 20.0 B |
| marc-codes-list.gz | GZIP | 144.9 KB |
| marctest.gz | GZIP | 144.0 B |
| marctest.nt | Unknown | 2.0 KB |
| medline.gz | GZIP | 32.5 MB |
| medline.nt | Unknown | 7.0 KB |
| metoffice-weather-forecasts.gz | GZIP | 197.4 MB |
| monumentos.gz | GZIP | 7.9 KB |
| moseley-folk-festival-data.gz | GZIP | 13.4 KB |
| mot-testing-stations.gz | GZIP | 3.5 MB |
| mot-testing-stations.nt | Unknown | 4.2 KB |
| mta-new-york-city-transit.gz | GZIP | 26.1 MB |
| mta-new-york-city-transit.nt | Unknown | 10.1 KB |
| musicbrainz-ng.gz | GZIP | 578.9 KB |
| musicbrainz-ng.nt | Unknown | 13.6 KB |
| musicnet.gz | GZIP | 2.8 MB |
| musicnet.nt | Unknown | 4.1 KB |
| myfoaf.gz | GZIP | 9.5 KB |
| myfoaf.nt | Unknown | 7.5 KB |
| nasa.gz | GZIP | 1.9 MB |
| near.gz | GZIP | ??B |
| new-york-times.gz | GZIP | 3.8 MB |
| new-york-times.nt | Unknown | 3.4 KB |
| nhs-ae-activity-statistics.gz | GZIP | 4.0 MB |
| nhs-ae-activity-statistics.nt | Unknown | 9.9 KB |
| nhs-hospital-activity-statistics.gz | GZIP | 546.1 KB |
| nhs-hospital-activity-statistics.nt | Unknown | 8.9 KB |
| nhs-organization.gz | GZIP | 27.7 MB |
| nhs-organization.nt | Unknown | 26.9 KB |
| nhs-performance-data.gz | GZIP | 421.5 MB |
| nhs-performance-data.nt | Unknown | 49.2 KB |
| nokia-lumia-900-case-case-nokia-lumia-900.gz | GZIP | 20.0 B |
| nptdr-rail.gz | GZIP | 8.7 MB |
| nptdr-rail.nt | Unknown | 3.6 KB |
| nra-women.gz | GZIP | 1.8 MB |
| nra.gz | GZIP | 1.4 MB |
| nxp-documentation.gz | GZIP | 614.4 KB |
| nxp-products.gz | GZIP | 922.7 KB |
| ofertaformativa.gz | GZIP | 17.9 KB |
| ontologiaejemplo.gz | GZIP | 20.0 B |
| ookaboo-20120206-beta.gz | GZIP | 462.9 MB |
| ookaboo-20120206-beta.nt | Unknown | 4.2 KB |
| openflights-airlines.gz | GZIP | 443.7 KB |
| openflights-airlines.nt | Unknown | 3.2 KB |
| openflights-routes.gz | GZIP | 3.3 MB |
| openflights-routes.nt | Unknown | 5.3 KB |
| ordnance-survey-linked-data.gz | GZIP | 347.8 MB |
| ordnance-survey-our-airports-linkset.gz | GZIP | 12.9 KB |
| ordnance-survey-our-airports-linkset.nt | Unknown | 2.8 KB |
| ordnance-survey-oxpoints-linkset.gz | GZIP | 12.1 KB |
| ordnance-survey-oxpoints-linkset.nt | Unknown | 2.8 KB |
| ordnance-survey-postcode-renewable-energy-generators-linkset.gz | GZIP | 53.5 KB |
| ordnance-survey-postcode-renewable-energy-generators-linkset.nt | Unknown | 4.1 KB |
| ordnance-survey-renewable-energy-generators-linkset.gz | GZIP | 42.5 KB |
| ordnance-survey-renewable-energy-generators-linkset.nt | Unknown | 3.4 KB |
| ordnance-survey-towns-latitude-longitude.gz | GZIP | 60.7 KB |
| ordnance-survey-towns-latitude-longitude.nt | Unknown | 4.2 KB |
| ordnance-survey-traffic-scotland-linkset.gz | GZIP | 68.6 KB |
| ordnance-survey-traffic-scotland-linkset.nt | Unknown | 3.1 KB |
| ordnance-survey-uk-education-linkset.gz | GZIP | 2.5 MB |
| ordnance-survey-uk-education-linkset.nt | Unknown | 3.0 KB |
| oreilly-opmi.gz | GZIP | 15.3 MB |
| oreilly-opmi.nt | Unknown | 6.7 KB |
| our-airports.gz | GZIP | 8.1 MB |
| oxpoints-geographic-qrcodes.gz | GZIP | 240.6 KB |
| oxpoints-geographic-qrcodes.nt | Unknown | 2.7 KB |
| oxpoints.gz | GZIP | 135.3 KB |
| oxpoints.nt | Unknown | 13.3 KB |
| pali-english-lexicon.gz | GZIP | 4.0 MB |
| pali-english-lexicon.nt | Unknown | 4.9 KB |
| panini-stickers.gz | GZIP | 27.1 KB |
| philippines-festival-0.gz | GZIP | 20.0 B |
| philippines-festival.gz | GZIP | 20.0 B |
| phrendpoints.gz | GZIP | 1.8 KB |
| phrendpoints.nt | Unknown | 8.7 KB |
| plants-1.gz | GZIP | 49.2 KB |
| plos-one.gz | GZIP | 3.8 MB |
| plos-one.nt | Unknown | 3.2 KB |
| pokedex-data-rdf.gz | GZIP | 349.5 KB |
| pranto-0.gz | GZIP | 20.0 B |
| prelinger-archives.gz | GZIP | 7.2 MB |
| prelinger-archives.nt | Unknown | 10.4 KB |
| prueba.gz | GZIP | 121.0 B |
| prueba.nt | Unknown | 2.0 KB |
| r4d-aid-data.gz | GZIP | 21.2 MB |
| rasika-athukorala.gz | GZIP | 20.0 B |
| rendimiento-escolar-chile.gz | GZIP | 1.9 MB |
| renewable-energy-generators.gz | GZIP | 774.2 KB |
| renewable-energy-generators.nt | Unknown | 9.1 KB |
| richards-training-data.gz | GZIP | 1.1 KB |
| richards-training-data.nt | Unknown | 7.6 KB |
| rnews-sampler.gz | GZIP | 6.2 KB |
| rnews-sampler.nt | Unknown | 3.7 KB |
| safecast-0.gz | GZIP | 2.1 KB |
| safecast-0.nt | Unknown | 4.3 KB |
| sanskrit-english-lexicon.gz | GZIP | 10.8 MB |
| sanskrit-english-lexicon.nt | Unknown | 4.7 KB |
| scotland-data.gz | GZIP | 1.6 KB |
| scotland-data.nt | Unknown | 2.8 KB |
| scottish-mountaineering-council-journals-issues-1-36.gz | GZIP | 195.4 KB |
| scottish-national-parks-and-postcode-areas.gz | GZIP | 17.6 KB |
| scottish-national-parks-and-postcode-areas.nt | Unknown | 4.8 KB |
| sembookshelf.gz | GZIP | 375.5 KB |
| sembookshelf.nt | Unknown | 4.3 KB |
| semiconductor-test-set-2.gz | GZIP | 322.5 KB |
| semiconductor-test-set-2.nt | Unknown | 52.2 KB |
| skills.gz | GZIP | 18.3 KB |
| skills.nt | Unknown | 3.1 KB |
| southampton-postcodes.gz | GZIP | 3.2 MB |
| southampton-postcodes.nt | Unknown | 45.5 KB |
| space-exercise.gz | GZIP | 540.0 B |
| study.gz | GZIP | 20.0 B |
| test-2.gz | GZIP | 182.0 B |
| test-2.nt | Unknown | 2.0 KB |
| test-6.gz | GZIP | 2.5 KB |
| test-6.nt | Unknown | 9.2 KB |
| test1.gz | GZIP | 1,005.0 B |
| test1.nt | Unknown | 2.0 KB |
| testdataset.gz | GZIP | 1.6 KB |
| testlocations.gz | GZIP | 33.1 KB |
| testlocations.nt | Unknown | 10.4 KB |
| tito.gz | GZIP | 820.0 B |
| toilet-information-sabae-city-fukui-prefecture-japan.gz | GZIP | 3.0 KB |
| traffic-scotland.gz | GZIP | 1.8 MB |
| turismo-asturias.gz | GZIP | 82.0 KB |
| twitter-data-eswc-conference.gz | GZIP | 555.0 KB |
| ubbspecial.gz | GZIP | 4.4 MB |
| uk-education-geographic-qrcodes.gz | GZIP | 41.9 MB |
| uk-education-geographic-qrcodes.nt | Unknown | 2.8 KB |
| uk-transport-operators.gz | GZIP | 11.1 KB |
| uk-transport-operators.nt | Unknown | 3.6 KB |
| un-hazardous-material-numbers.gz | GZIP | 173.8 KB |
| un-hazardous-material-numbers.nt | Unknown | 5.8 KB |
| unitedstatesgeographyowl.gz | GZIP | 9.4 KB |
| unitedstatesgeographyowl.nt | Unknown | 5.7 KB |
| uow-archives.gz | GZIP | 37.9 MB |
| uow-archives2.gz | GZIP | 709.0 B |
| uri.gz | GZIP | 438.0 B |
| uri.nt | Unknown | 2.5 KB |
| us-geography.gz | GZIP | 20.0 B |
| utpl.gz | GZIP | 154.0 B |
| utpl.nt | Unknown | 1.9 KB |
| veille-pearltrees.gz | GZIP | 138.9 KB |
| veille-pearltrees.nt | Unknown | 6.3 KB |
| view.gz | GZIP | 4.1 KB |
| void.gz | GZIP | 4.7 KB |
| whealton-family-genealogy-data.gz | GZIP | 167.3 KB |
| whealton-family-genealogy-data.nt | Unknown | 6.4 KB |
| wombra-scottish-technical-standards-section-6-energy.gz | GZIP | 5.1 KB |
| wombra-scottish-technical-standards-section-6-energy.nt | Unknown | 16.6 KB |
| world-air-travel.gz | GZIP | 10.4 MB |
| world-air-travel.nt | Unknown | 15.4 KB |
| world-geography.gz | GZIP | 21.5 MB |
| world-geography.nt | Unknown | 5.7 KB |
| www2012.gz | GZIP | 333.3 KB |
| yahoo-geoplanet.gz | GZIP | 277.8 MB |
| yahoo-geoplanet.nt | Unknown | 17.5 KB |
| yokohama-attractions.gz | GZIP | 62.4 KB |
| yokohama-attractions.nt | Unknown | 3.6 KB |
| yokohama-city-budget-plan-fy2012.gz | GZIP | 21.9 KB |
| zach-foaf.gz | GZIP | 832.0 B |
| zach-foaf.nt | Unknown | 3.1 KB |